Abstract

We introduce regenerative tree growth processes as consistent families of random trees with $n$ labelled leaves, $n \ge 1$, with a regenerative property at branch points. We establish necessary and sufficient conditions on the growth rules under which we can apply results by Haas and Miermont to establish self-similar random trees and residual mass processes as scaling limits.

This framework includes all growth processes for exchangeably labelled Markov branching trees, as well as non-exchangeable models such as the alpha-theta model, the alpha-gamma model and all restricted exchangeable models previously studied. A key result is a representation of the growth rules with a $\sigma$-finite dislocation measure extending Bertoin's notion of an exchangeable dislocation measure from the setting of homogeneous fragmentations.

1 Introduction to regenerative tree growth processes

For each $n \ge 1$, denote by $\mathbb{T}_n$ the set of rooted leaf-labelled combinatorial trees with no degree-2 vertices and $n+1$ degree-1 vertices, one of which is called the root, the others leaves. We distinguish the leaves by labels $1, \ldots, n$. Vertices of degree 3 or higher are called branch points. Consider a family $T_n$, $n \ge 1$, of random trees in $\mathbb{T}_n$, $n \ge 1$. For $n \ge 2$, we refer to the vertex adjacent to the root as the first branch point. It induces the first split, a random partition $\Pi_n = (\Pi_{n,1}, \ldots, \Pi_{n,K_n})$ of the label set $[n] := \{1, \ldots, n\}$ into the label sets of the subtrees above the branch point, the connected components of the tree with the first branch point removed. Here, we put the blocks $\Pi_{n,i}$ of $\Pi_n$ in the order of their least elements. For illustration, we write

[Display: pictures of the trees in $\mathbb{T}_1$, $\mathbb{T}_2$, $\mathbb{T}_3$, etc.]

where we have ordered subtrees by their least labels to uniquely choose plane tree representatives. 

We suppose that the family $(T_n, n \ge 1)$ is consistent in the sense that removal of leaf $n+1$ (and the resulting degree-2 vertex, if any) from $T_{n+1}$ yields $T_n$. Reversing this removal gives a tree growth step from $n$ to $n+1$. A consistent family $(T_n, n \ge 1)$ is called a tree growth process. For $B \subseteq [n]$, let $\mathbb{T}_B$ be the set of trees with $\#B$ leaves labelled by $B$, so that $\mathbb{T}_{[n]} = \mathbb{T}_n$. Let $T_{n,B} \in \mathbb{T}_B$ be the reduced subtree of $T_n$ spanned by the root of $T_n$ and leaves in $B$, and let $\widetilde{T}_{n,B} \in \mathbb{T}_{[\#B]}$ be the image of $T_{n,B}$ after relabelling of leaves by the increasing bijection from $B$ to $[\#B]$.

Definition 1 We call a tree growth process $(T_n, n \ge 1)$ regenerative if for each $n \ge 2$, conditionally given that the first split of $T_n$ is $\Pi_n = (B_1, \ldots, B_k)$, the subtrees $\widetilde{T}_{n,B_i}$, $1 \le i \le k$, are independent copies of $T_{\#B_i}$.

In the terminology of [5], the trees in a regenerative tree growth process as defined here are "consistent labelled Markov branching trees". The exchangeable case, where the distribution of $T_n$ is invariant under all permutations of labels, was initiated by Aldous [2], who posed the problem of providing a Kingman-type representation in this case. Bertoin's [3] theory of homogeneous fragmentations solved that problem as explained in [13]. Then [15, 18] studied regenerative tree growth processes associated with fragmentation processes. Natural non-exchangeable tree growth processes were described in terms of simple growth rules that admit regenerative descriptions based on the first split and its subtrees, see particularly [4, 10, 23], as reviewed in Examples 3 and 4 below. Such regenerative growth rules are a key consequence of Definition 1 and lead to the following characterisation (cf. Figure 1). We leave the straightforward proof to the reader.

Figure 1: Illustration of the regenerative tree growth step

Proposition 2 In the tree growth step from $n$ to $n+1$ for $n \ge 2$, there are the following disjoint events, $G_{n,i}$ for $i = 0, \ldots, K_n + 1$, where $K_n \ge 2$ is the number of blocks of the first split of $T_n$:

• $G_{n,0}$: leaf $n+1$ is attached to a new branch point between the root and the first branch point of $T_n$;

• $G_{n,i}$, $1 \le i \le K_n$: label $n+1$ is inserted into the $i$th block of the first split;

• $G_{n,K_n+1}$: leaf $n+1$ is attached to the first branch point, as singleton block of the first split.

A tree growth process $(T_n, n \ge 1)$ is regenerative if and only if $\mathbb{P}(G_{n,0} \mid T_n) = \mathbb{P}(G_{n,0})$ does not depend on $T_n$ and $\mathbb{P}(G_{n,i} \mid T_n) = \mathbb{P}(G_{n,i} \mid \Pi_n)$, $1 \le i \le K_n + 1$, only depends on the partition $\Pi_n$ of the first split. In the event $G_{n,i}$, $1 \le i \le K_n$, label $n+1$ is inserted into the $i$th subtree of $T_n$ of size $\#\Pi_{n,i}$ following the same rules, up to relabelling by the increasing bijection from $\Pi_{n,i} \cup \{n+1\}$ to $[\#\Pi_{n,i} + 1]$.

We denote by $\mathcal{P}_n$ the set of partitions $\pi = (B_1, \ldots, B_k)$ of $[n]$, with blocks $B_i$ ordered by least element. We use notation $g_n(\pi, i) = \mathbb{P}(G_{n,i} \mid \Pi_n = \pi)$, $0 \le i \le k+1$, for $\pi \ne \mathbf{1}_{[n]} := ([n])$, $n \ge 2$, and write $g_n(0) = g_n(\pi, 0)$, since we require that this quantity is independent of $\pi \in \mathcal{P}_n \setminus \{\mathbf{1}_{[n]}\}$.



Example 3 (Alpha-theta model [23]) For $0 \le \alpha \le 1$, $\theta \ge 0$ and $\pi = (B_1, B_2) \in \mathcal{P}_n$, let

$$g_n(\pi,0) = \frac{\alpha}{n-1+\theta}, \qquad g_n(\pi,1) = \frac{\#B_1 - 1 + \theta}{n-1+\theta}, \qquad g_n(\pi,2) = \frac{\#B_2 - \alpha}{n-1+\theta}, \qquad g_n(\pi,k) = 0, \quad k \ge 3.$$

Example 4 (Alpha-gamma model [4]) For $0 \le \gamma \le \alpha \le 1$ and $\pi = (B_1, \ldots, B_k) \in \mathcal{P}_n$, let

$$g_n(\pi,0) = \frac{\gamma}{n-\alpha}, \qquad g_n(\pi,i) = \frac{\#B_i - \alpha}{n-\alpha}, \quad i \in [k], \qquad g_n(\pi,k+1) = \frac{(k-1)\alpha - \gamma}{n-\alpha}.$$

In these examples, the regenerative property was shown in [23, Proposition 11] and [5, Proposition 8], respectively. They both contain as special case for $\alpha = 1/2$ and, respectively, $\theta = 1/2$ and $\gamma = 1/2$, the exchangeable uniform model on binary trees, related to Aldous's Brownian Continuum Random Tree [1]. The uniform model on $\mathbb{T}_n$ is not consistent, see [22] for weak limits. Ford's binary alpha-model [10] is also included in both examples.
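The growth rules of Examples 3 and 4 can be checked numerically. The following is a minimal sketch in Python with exact rational arithmetic; the function names `alpha_theta_rule` and `alpha_gamma_rule` are illustrative choices, not notation from the text. Each function returns the vector $(g_n(\pi,0), \ldots, g_n(\pi,k+1))$ for a partition given by its block sizes.

```python
from fractions import Fraction


def alpha_theta_rule(block_sizes, alpha, theta):
    """Growth rule of Example 3 for a binary first split with sizes (#B_1, #B_2)."""
    b1, b2 = block_sizes
    denom = b1 + b2 - 1 + theta
    return [
        alpha / denom,              # g_n(pi, 0): new branch point below the first split
        (b1 - 1 + theta) / denom,   # g_n(pi, 1): insert into the block containing label 1
        (b2 - alpha) / denom,       # g_n(pi, 2): insert into the second block
        Fraction(0),                # g_n(pi, 3): no third block is ever created
    ]


def alpha_gamma_rule(block_sizes, alpha, gamma):
    """Growth rule of Example 4 for a first split with block sizes (#B_1, ..., #B_k)."""
    n, k = sum(block_sizes), len(block_sizes)
    denom = n - alpha
    probs = [gamma / denom]                              # g_n(pi, 0)
    probs += [(b - alpha) / denom for b in block_sizes]  # g_n(pi, i), 1 <= i <= k
    probs.append(((k - 1) * alpha - gamma) / denom)      # g_n(pi, k + 1): new singleton block
    return probs
```

In both models the entries sum to 1 for every partition, and for $\alpha = \theta = \gamma = 1/2$ the two rules agree on binary splits, in line with the remark on the uniform binary model.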
Let us express the splitting rules $p_n(\pi) = \mathbb{P}(\Pi_n = \pi)$ explicitly in terms of $(g_n, n \ge 2)$. From the growth rule, we have for all $\pi = (B_1, \ldots, B_k) \in \mathcal{P}_n \setminus \{\mathbf{1}_{[n]}\}$

$$p_2(\{1\},\{2\}) = 1, \qquad p_{n+1}([n],\{n+1\}) = g_n(0), \quad n \ge 2,$$
$$p_{n+1}(B_1, \ldots, B_{i-1}, B_i \cup \{n+1\}, B_{i+1}, \ldots, B_k) = p_n(B_1, \ldots, B_k)\, g_n(\pi, i), \quad i \in [k+1]. \tag{1}$$

Using the natural convention $g_1(0) = 1$, the solution to these equations can be written as

$$p_n(\pi) = p_n(B_1, \ldots, B_k) = g_{\min B_2 - 1}(0) \prod_{j=\min B_2}^{n-1} g_j(\pi^{[j]}, I_j), \quad \text{where } I_j = i \text{ if } j+1 \in B_i, \tag{2}$$

and $\pi^{[j]}$ is the vector of non-empty $B_i \cap [j]$. The RHS of this formula is the probability of successively creating a new first branch point when $\min B_2$ is added and inserting all higher labels such that the resulting partition at the first split is $\pi$. By the regenerative property of $T_n$, we can write tree probabilities as a product over branch points; for a tree $t \in \mathbb{T}_n$, we identify each vertex with the set $B$ of labels in the subtrees above this vertex, write $\pi(B)$ for the partition of the split at $B$, and $\widetilde{\pi}(B)$ for the partition of $[\#B]$ obtained when relabelling $\pi(B)$ by the increasing bijection from $B$ to $[\#B]$:

$$\mathbb{P}(T_n = t) = \prod_{B \in t:\, \#B \ge 2} p_{\#B}(\widetilde{\pi}(B)) = \prod_{B \in t:\, \#B \ge 2} \Bigg( g_{\min \widetilde{\pi}(B)_2 - 1}(0) \prod_{j = \min \widetilde{\pi}(B)_2}^{\#B - 1} g_j\big(\widetilde{\pi}(B)^{[j]}, I_j(B)\big) \Bigg), \tag{3}$$

where $I_j(B) = i$ if $j+1 \in \widetilde{\pi}(B)_i$, and where $\widetilde{\pi}(B) = (\widetilde{\pi}(B)_1, \ldots, \widetilde{\pi}(B)_{k(B)})$.
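Formula (2) can be evaluated directly. The sketch below does this for the alpha-gamma growth rule of Example 4, chosen here only as an illustration; the helper names `partitions`, `growth_rule` and `splitting_prob` are hypothetical. Partitions are represented as lists of blocks ordered by least element, so a block of $\pi$ has the same index in every restriction $\pi^{[j]}$ on which it is non-empty.

```python
from fractions import Fraction


def partitions(n):
    """Yield all partitions of {1, ..., n} as lists of blocks ordered by least element."""
    if n == 1:
        yield [[1]]
        return
    for p in partitions(n - 1):
        for i in range(len(p)):
            yield p[:i] + [p[i] + [n]] + p[i + 1:]   # n joins the (i+1)-th block
        yield p + [[n]]                              # n forms a new last block


def growth_rule(sizes, i, alpha, gamma):
    """Alpha-gamma rule g_n(pi, i) of Example 4 for a partition with block sizes `sizes`."""
    n, k = sum(sizes), len(sizes)
    if i == 0:
        numerator = gamma
    elif i <= k:
        numerator = sizes[i - 1] - alpha
    else:
        numerator = (k - 1) * alpha - gamma
    return numerator / (n - alpha)


def splitting_prob(pi, alpha, gamma):
    """p_n(pi) computed from formula (2), using the convention g_1(0) = 1."""
    n = sum(len(block) for block in pi)
    m = min(pi[1])                                   # min B_2 creates the first split
    prob = Fraction(1) if m == 2 else growth_rule([m - 1], 0, alpha, gamma)
    for j in range(m, n):                            # insert labels m + 1, ..., n
        sizes = [sum(1 for x in block if x <= j) for block in pi]
        sizes = [s for s in sizes if s > 0]          # block sizes of pi restricted to [j]
        index = next(i for i, block in enumerate(pi, start=1) if j + 1 in block)
        prob *= growth_rule(sizes, index, alpha, gamma)   # factor g_j(pi^[j], I_j)
    return prob
```

For instance, with $\alpha = 1/2$ and $\gamma = 1/3$, `splitting_prob([[1, 2], [3]], ...)` equals $g_2(0) = \gamma/(2-\alpha) = 2/9$, and summing `splitting_prob` over all partitions of $[n]$ with at least two blocks gives 1.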

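Example 5 (Poisson-Dirichlet model [16, 18]) According to [18], the only consistent exchangeable model with splitting rules of the Gibbs form

$$p_n(\pi) = \frac{a_k}{c_n} \prod_{i=1}^{k} w_{\#B_i}, \qquad \pi = (B_1, \ldots, B_k) \in \mathcal{P}_n \setminus \{\mathbf{1}_{[n]}\}, \quad \text{for some } w_j \ge 0,\ a_k \ge 0,\ c_n > 0,$$

is given by a two-parameter family. Most relevant for us are $0 < \alpha < 1$ and $\theta > -2\alpha$ with $w_j = \Gamma(j-\alpha)/\Gamma(1-\alpha)$, $j \ge 1$, and $a_k = \alpha^{k-2}\,\Gamma(k+\theta/\alpha)/\Gamma(2+\theta/\alpha)$, $k \ge 2$, with normalisation constants $c_n = c_{\alpha,\theta}(n)$ satisfying $c_{\alpha,\theta}(2) = 1$ and $c_{\alpha,\theta}(n+1) = (n+\theta)\,c_{\alpha,\theta}(n) + \Gamma(n-\alpha)/\Gamma(1-\alpha)$, $n \ge 2$. Case $\alpha = 0$ is a limiting case. These yield growth rules for $\pi = (B_1, \ldots, B_k)$ of the form

$$g_n(0) = p_{n+1}([n],\{n+1\}) = \frac{\Gamma(n-\alpha)}{\Gamma(1-\alpha)\, c_{\alpha,\theta}(n+1)},$$

$$g_n(\pi,j) = \frac{p_{n+1}(B_1, \ldots, B_{j-1}, B_j \cup \{n+1\}, B_{j+1}, \ldots, B_k)}{p_n(B_1, \ldots, B_k)} = \frac{(\#B_j - \alpha)\, c_{\alpha,\theta}(n)}{c_{\alpha,\theta}(n+1)}, \quad j \in [k],$$

$$g_n(\pi,k+1) = \frac{(k\alpha + \theta)\, c_{\alpha,\theta}(n)}{c_{\alpha,\theta}(n+1)}.$$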

Aldous's binary beta model [2] is included for $\theta = -2\alpha$. Both the Poisson-Dirichlet model and the alpha-gamma model contain as special cases for $\alpha \in [1/2, 1)$ and, respectively, $\theta = -1$ and $\gamma = 1 - \alpha$, the exchangeable model related to the stable Continuum Random Tree [6| fTTj [19].
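The recursion for $c_{\alpha,\theta}(n)$ makes the growth rules of Example 5 easy to tabulate. The sketch below is an illustration under the assumption that $\alpha$ and $\theta$ are rational, so that $w_j = \Gamma(j-\alpha)/\Gamma(1-\alpha) = (1-\alpha)(2-\alpha)\cdots(j-1-\alpha)$ can be computed exactly; the names `w`, `c` and `pd_rule` are illustrative.

```python
from fractions import Fraction


def w(j, alpha):
    """w_j = Gamma(j - alpha) / Gamma(1 - alpha), written as a finite product."""
    out = Fraction(1)
    for i in range(1, j):
        out *= i - alpha
    return out


def c(n, alpha, theta):
    """Normalisation constant c_{alpha,theta}(n), n >= 2, from its recursion."""
    value = Fraction(1)                              # c(2) = 1
    for m in range(2, n):
        value = (m + theta) * value + w(m, alpha)    # c(m + 1) from c(m)
    return value


def pd_rule(sizes, alpha, theta):
    """Return [g_n(0), g_n(pi, 1), ..., g_n(pi, k + 1)] for block sizes `sizes`."""
    n, k = sum(sizes), len(sizes)
    c_next = c(n + 1, alpha, theta)
    ratio = c(n, alpha, theta) / c_next
    probs = [w(n, alpha) / c_next]                   # g_n(0)
    probs += [(b - alpha) * ratio for b in sizes]    # g_n(pi, j), 1 <= j <= k
    probs.append((k * alpha + theta) * ratio)        # g_n(pi, k + 1)
    return probs
```

The entries sum to 1 by the recursion for $c_{\alpha,\theta}$. For $\theta = -1$ the output coincides with the alpha-gamma rule of Example 4 with $\gamma = 1 - \alpha$; for example, $\alpha = 2/3$ and block sizes $(3,1,2)$ give $(1/16, 7/16, 1/16, 4/16, 3/16)$.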

In the same way as the Poisson-Dirichlet model, all consistent exchangeable [15] and consistent restricted exchangeable [5] labelled Markov branching trees give rise to a regenerative tree growth rule. Section 2 develops this in terms of associated dislocation measures.

In a regenerative tree growth process $(T_n, n \ge 1)$, we can, for each $n \ge 1$, study the residual mass process of label 1, i.e. the evolution of the size $X_m^{(n)}$ in $T_n$ of the block containing label 1. This process is a Markov chain in $m \ge 0$ starting from $X_0^{(n)} = n$, decreasing to $X_1^{(n)} = \#\Pi_{n,1}$ and further according to successive splits until $M_n = \inf\{m \ge 0 : X_m^{(n)} = 1\}$, when label 1 becomes a singleton. We represent this Markov chain as a composition of $n$

$$C_n = \big(C_0^{(n)}, \ldots, C_{M_n}^{(n)}\big) = \big(X_0^{(n)} - X_1^{(n)},\ X_1^{(n)} - X_2^{(n)},\ \ldots,\ X_{M_n-1}^{(n)} - X_{M_n}^{(n)},\ X_{M_n}^{(n)}\big).$$

Proposition 6 In a regenerative tree growth process, the family $(C_n, n \ge 1)$ of compositions is regenerative in the sense that conditionally given $C_0^{(n)} = k$, the composition $(C_1^{(n)}, \ldots, C_{M_n}^{(n)})$ of $n - k$ has the same distribution as $C_{n-k}$. The entries of the transition probability matrix are

$$\mathbb{P}\big(C_0^{(n)} = n - k\big) = \mathbb{P}\big(X_1^{(n)} = k\big) = \sum_{\pi = (B_1, \ldots, B_\ell) \in \mathcal{P}_n :\, \#B_1 = k} p_n(\pi), \qquad 1 \le k \le n-1.$$

This is a straightforward consequence of Definition 1. We stress that we have consistency in the sense that $C_n$ can be obtained from $C_{n+1}$ by reducing one part of $C_{n+1}$ by 1 (the one corresponding to label $n+1$ in $T_{n+1}$), but $(C_n, n \ge 1)$ is not sampling consistent in the sense of [11] as this part is not a size-biased pick from $C_{n+1}$, in general. In special cases, versions of this proposition are in the literature; in the exchangeable (sampling consistent) case, it is implicit in Bertoin's [3] study of tagged particles and explicit in [16]; for the alpha-theta model see [23, Proposition 6].

The structure of this paper is as follows. In Section 2, we encode regenerative tree growth rules $(g_n, n \ge 2)$ in a measure $\kappa$, generalising Bertoin's [3] notion of a dislocation measure. Sections 3 and 4 establish necessary and sufficient conditions on the growth rules under which results by Haas and Miermont [13] apply to give scaling limits for the trees and for the residual mass processes in a regenerative tree growth process. Section 5 indicates some open problems and possible future work on regenerative tree growth.

2 Dislocation measures 

Recall the notation $\mathcal{P}_n$ for the set of partitions $\pi = (B_1, \ldots, B_k)$ of $[n] = \{1, \ldots, n\}$ with blocks ordered by least element. In the Introduction, we regarded splitting rules $p_n$ as distributions on $\mathcal{P}_n \setminus \{\mathbf{1}_{[n]}\}$, where $\mathbf{1}_{[n]} = ([n])$ is the partition that does not split $[n]$. In this section, we consider the set $\mathcal{P}$ of all partitions $\Gamma = (\Gamma_i, i \ge 1)$ of $\mathbb{N}$, with blocks ordered by least element and $\Gamma_i = \emptyset$ if there are fewer than $i$ blocks. For $\Gamma \in \mathcal{P}$, we denote by $\Gamma^{[n]} \in \mathcal{P}_n$ the partition $\Gamma$ restricted to $[n]$, with the non-empty $\Gamma_i \cap [n]$ as blocks. For each $\pi = (B_1, \ldots, B_k) \in \mathcal{P}_n$, $n \ge 1$, we let $\mathcal{P}^\pi = \{\Gamma \in \mathcal{P} : \Gamma^{[n]} = \pi\}$, the set of partitions of $\mathbb{N}$ that restrict to $\pi$ on $[n]$. We equip $\mathcal{P}$ with the $\sigma$-algebra generated by $\{\mathcal{P}^\pi, \pi \in \mathcal{P}_n, n \ge 1\}$, which is also the Borel $\sigma$-algebra generated by the metric $d(\Gamma, \Gamma') = \exp(-\inf\{n \ge 1 : \Gamma^{[n]} \ne \Gamma'^{[n]}\})$. For $\Gamma \in \mathcal{P}$ and $n \ge 1$, consider the decreasing rearrangement $|\Gamma^{[n]}|^\downarrow = (|\Gamma^{[n]}|_i^\downarrow, i \ge 1)$ of relative frequencies $(|\Gamma_i^{[n]}|, i \ge 1)$, where $|\Gamma_i^{[n]}| = \#\Gamma_i^{[n]}/n$.

If the limit as $n \to \infty$ of $|\Gamma^{[n]}|_i^\downarrow$ or of $|\Gamma_i^{[n]}|$ exists, this is denoted by $|\Gamma|_i^\downarrow$ and $|\Gamma_i|$, respectively, and we say that an asymptotic frequency exists for that part. While the existence of some $|\Gamma_i|$ or $|\Gamma|_j^\downarrow$ does not in general imply the existence of any other $|\Gamma_k|$ or $|\Gamma|_k^\downarrow$, we note the following:

Lemma 7 Existence of $|\Gamma|_i^\downarrow$ for all $i \ge 1$ holds if and only if $(|\Gamma_i|, i \ge 1)$ exists as a uniform limit. In this case, $(|\Gamma|_i^\downarrow, i \ge 1)$ is the decreasing rearrangement of $(|\Gamma_i|, i \ge 1)$, which we write as $|\Gamma|^\downarrow$.

Under the equivalent conditions we say that asymptotic ranked frequencies exist. We leave the proof of Lemma 7 as an exercise to the reader. The partition $\Gamma = (\{2^{i-1}, \ldots, 2^i - 1\}, i \ge 1)$ is an example where $|\Gamma_i|$, $i \ge 1$, exists, but $|\Gamma|_1^\downarrow$ does not.
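The counterexample with dyadic blocks can be explored numerically. The following sketch (the function name and the two subsequences of $n$ are illustrative choices) computes the largest relative frequency $|\Gamma^{[n]}|_1^\downarrow$ for $\Gamma = (\{2^{i-1}, \ldots, 2^i - 1\}, i \ge 1)$. Every block is finite, so each $|\Gamma_i| = 0$, while the largest relative frequency oscillates.

```python
from fractions import Fraction


def largest_relative_frequency(n):
    """Largest relative block frequency of the dyadic partition restricted to [n]."""
    sizes = []
    start = 1                                # block i is {2^(i-1), ..., 2^i - 1}
    while start <= n:
        end = min(2 * start - 1, n)          # the last block may be cut off by n
        sizes.append(end - start + 1)
        start *= 2
    return Fraction(max(sizes), n)
```

Along $n = 2^i - 1$ the value is $2^{i-1}/(2^i - 1)$, which tends to $1/2$, whereas along $n = 3 \cdot 2^{i-1} - 1$ it is $2^{i-1}/(3 \cdot 2^{i-1} - 1)$, which tends to $1/3$, so no limit exists.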

To study convergence to a self-similar tree $\mathcal{T}_{\gamma,\nu}$ in the sense of [12, 14], we will need to identify the scaling index $\gamma > 0$ and a measure $\nu$ on $\mathcal{S}^\downarrow = \{s = (s_i)_{i \ge 1} : s_1 \ge s_2 \ge \cdots \ge 0,\ \sum_{i \ge 1} s_i \le 1\}$. In the context of a regenerative tree growth process, where we have labels and consistency, these will be related to a richer structure, a dislocation measure $\kappa$ on $\mathcal{P}$. This $\kappa$ will provide rates

$$\lambda_n = \kappa(\{\Gamma \in \mathcal{P} : \Gamma^{[n]} \ne ([n])\}) = \kappa(\mathcal{P} \setminus \mathcal{P}^{[n]}),$$

where $\mathcal{P}^{[n]} = \mathcal{P}^{\mathbf{1}_{[n]}}$ is the set of partitions that do not split $[n]$, for the first split of $[n]$, $n \ge 2$, that allow to consistently embed the evolution of blocks in $T_n$, $n \ge 1$, into continuous time (see Theorem 16); the rate $\lambda_n$ of the first split of $[n]$ can then be thinned by the event that this split also splits $[n-1]$, an event with probability $1 - g_{n-1}(0)$, where $(g_n, n \ge 2)$ is the growth rule of the regenerative tree growth process, so that we need

$$\lambda_n (1 - g_{n-1}(0)) = \lambda_{n-1}, \quad n \ge 3, \qquad \text{and hence} \qquad \lambda_n = \lambda_2 \prod_{j=2}^{n-1} \frac{1}{1 - g_j(0)}, \quad \text{if } g_j(0) \ne 1,\ j \ge 2. \tag{4}$$

Note that $g_j(0) = 1$ for any $j \ge 2$ means that all future insertions are made below the split concerned; if scaling limits of $(T_n, n \ge 1)$ exist at all, the parts of trees above such splits will collapse in the scaling as $n \to \infty$. We will exclude such behaviour in the sequel and make the

Assumption (A) $g_j(0) < 1$ for all $j \ge 2$.
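The product formula (4) is straightforward to evaluate. As a minimal sketch, assume the alpha-gamma rule $g_j(0) = \gamma/(j - \alpha)$ of Example 4 with illustrative parameter values; the function name `rates` is hypothetical. The code builds $\lambda_n$ recursively and estimates the growth exponent from $\lambda_{2m}/\lambda_m$.

```python
import math


def rates(g0, n_max, lambda2=1.0):
    """Return {n: lambda_n} for 2 <= n <= n_max via (4); g0(j) = g_j(0) must be < 1."""
    lam = {2: lambda2}
    for n in range(3, n_max + 1):
        lam[n] = lam[n - 1] / (1.0 - g0(n - 1))   # lambda_n (1 - g_{n-1}(0)) = lambda_{n-1}
    return lam


# Illustrative parameters (not from the text): alpha = 0.6, gamma = 0.3.
alpha, gamma = 0.6, 0.3
lam = rates(lambda j: gamma / (j - alpha), 4000)
exponent_estimate = math.log(lam[4000] / lam[2000]) / math.log(2)
```

In this case the product telescopes into a ratio of Gamma functions, so $\lambda_n$ grows like a constant multiple of $n^\gamma$ and `exponent_estimate` is close to $\gamma = 0.3$, the kind of regular variation assumed in Section 3.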

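Proposition 8 Consider a regenerative tree growth rule $(g_n, n \ge 2)$ satisfying Assumption (A) with splitting rules $(p_n, n \ge 2)$ given by (2), and let $\lambda_2 > 0$ be arbitrary. With $\lambda_n$, $n \ge 3$, defined by (4), define

$$\kappa(\mathcal{P}^\pi) = \lambda_n\, p_n(\pi), \quad \pi \in \mathcal{P}_n \setminus \{\mathbf{1}_{[n]}\},\ n \ge 2; \qquad \kappa(\{\mathbf{1}_{\mathbb{N}}\}) = 0. \tag{5}$$

Then $\kappa$ extends uniquely to a measure on $\mathcal{P}$.

For the proof, cf. Bertoin's argument [3, Proposition 3.2] in the exchangeable case, based on Carathéodory's Extension Theorem. We refer to any measure $\kappa$ on $\mathcal{P}$ with $\kappa(\{\mathbf{1}_{\mathbb{N}}\}) = 0$ and $0 < \kappa(\mathcal{P} \setminus \mathcal{P}^{[n]}) < \infty$, $n \ge 2$, as a dislocation measure.
We can now condition $\kappa$ on splitting $[n]$ and write $p_n$ as

$$p_n(\pi) = \kappa(\mathcal{P}^\pi) / \kappa(\mathcal{P} \setminus \mathcal{P}^{[n]}), \qquad \pi \in \mathcal{P}_n \setminus \{\mathbf{1}_{[n]}\},\ n \ge 2. \tag{6}$$

We also deduce a one-to-one correspondence between $\kappa$ and $((g_n, n \ge 2), \lambda_2)$:

Proposition 9 Given a regenerative tree growth rule $(g_n, n \ge 2)$ satisfying Assumption (A), $\lambda_2 > 0$ and $\kappa$ associated via (5), we can recover $(g_n, n \ge 2)$ from $\kappa$ via $\lambda_n = \kappa(\mathcal{P} \setminus \mathcal{P}^{[n]})$, $n \ge 2$, as

$$g_n(0) = 1 - \frac{\lambda_n}{\lambda_{n+1}}, \qquad g_n(\pi, i) = \frac{\lambda_n\, \kappa\big(\mathcal{P}^{(B_1, \ldots, B_{i-1}, B_i \cup \{n+1\}, B_{i+1}, \ldots, B_k)}\big)}{\lambda_{n+1}\, \kappa(\mathcal{P}^\pi)}, \quad i \in [k+1], \tag{7}$$

$\pi = (B_1, \ldots, B_k) \in \mathcal{P}_n \setminus \{\mathbf{1}_{[n]}\}$. Furthermore, any dislocation measure $\kappa$ yields $\lambda_2 > 0$ and a regenerative tree growth rule $(g_n, n \ge 2)$ as given in (7), which satisfies Assumption (A).

We define Kingman's paintbox $\kappa_s$ for $s \in \mathcal{S}^\downarrow = \{s = (s_i)_{i \ge 1} : s_1 \ge s_2 \ge \cdots \ge 0,\ \sum_{i \ge 1} s_i \le 1\}$ as the distribution of the random partition $\Pi$ of $\mathbb{N}$ where $i, j \in \mathbb{N}$ are in the same block if $i = j$ or $R_i = R_j \ge 1$, where the $R_i$, $i \in \mathbb{N}$, are independent random variables with $\mathbb{P}(R_i = k) = s_k$, $k \ge 0$, and where $s_0 = 1 - \sum_{i \ge 1} s_i$. The Strong Law of Large Numbers implies that $\kappa_s$-a.e. $\Gamma \in \mathcal{P}$ has asymptotic ranked frequencies $|\Gamma|^\downarrow = s$.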
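The paintbox construction translates directly into a sampler for the restriction to $[n]$. The sketch below assumes a finite vector `s` of masses (an infinite $s$ would have to be truncated) and uses the hypothetical name `sample_paintbox`; colour 0 plays the role of $R_i = 0$, so such labels become singletons.

```python
import random


def sample_paintbox(s, n, rng):
    """Sample the restriction to [n] of a paintbox partition with masses s_1 >= s_2 >= ..."""
    dust = max(0.0, 1.0 - sum(s))                # s_0 = 1 - sum of the s_i
    weights = [dust] + list(s)                   # index 0 is dust, index k >= 1 is colour k
    colours = rng.choices(range(len(weights)), weights=weights, k=n)
    by_colour = {}
    blocks = []                                  # blocks appear in order of least element
    for label, colour in enumerate(colours, start=1):
        if colour == 0:
            blocks.append([label])               # R_label = 0: singleton block
        elif colour in by_colour:
            by_colour[colour].append(label)      # joins the existing block of its colour
        else:
            by_colour[colour] = [label]
            blocks.append(by_colour[colour])
    return blocks
```

For example, `sample_paintbox([0.5, 0.3], 2000, random.Random(1))` returns a partition of $[2000]$ whose two largest blocks have relative sizes near 0.5 and 0.3, with the remaining labels in singletons, as the Strong Law of Large Numbers suggests.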

Example 10 (Exchangeable models [3, 15]) Bertoin classified all exchangeable dislocation measures, i.e. measures that are invariant under the action of permutations of $\mathbb{N}$ on $\mathcal{P}$, giving an integral representation

$$\kappa = c \sum_{j \ge 1} \delta_{e^{(j)}} + \int_{\mathcal{S}^\downarrow} \kappa_s(\,\cdot\,)\, \nu(ds),$$

where $c \ge 0$, $e^{(j)}$ is the partition with blocks $\{j\}$ and $\mathbb{N} \setminus \{j\}$, and $\nu$ is a measure on $\mathcal{S}^\downarrow$ with

$$\nu(\{(1, 0, 0, \ldots)\}) = 0, \qquad \int_{\mathcal{S}^\downarrow} (1 - s_1)\, \nu(ds) < \infty. \tag{8}$$

Then $\nu$ is the push-forward of $\kappa$ under $\Gamma \mapsto |\Gamma|^\downarrow$, restricted to $\mathcal{S}^\downarrow \setminus \{(1, 0, \ldots)\}$.

The splitting rules $(p_n, n \ge 2)$ associated with Bertoin's exchangeable dislocation measures $\kappa$ give rise to the consistent exchangeably labelled Markov branching trees of [15].

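Example 11 (Poisson-Dirichlet model) Dislocation measures of Ewens-Pitman type are exchangeable with

$$c = 0 \quad \text{and} \quad \nu(ds) = \mathrm{PD}^*_{\alpha,\theta}(ds) = \mathbb{E}\big(\sigma_1^{-\theta};\ \sigma_1^{-1} \Delta\sigma_{[0,1]} \in ds\big), \qquad \theta > -2\alpha,\ \alpha \in (0,1),$$

where $(\sigma_t, t \ge 0)$ is a stable subordinator with Laplace transform $\mathbb{E}(e^{-\lambda \sigma_t}) = e^{-t\lambda^\alpha}$ and $\Delta\sigma_{[0,1]}$ the decreasing rearrangement of its jumps $\Delta\sigma_t = \sigma_t - \sigma_{t-}$, $t \in [0,1]$, see [161 02] for details. The associated $(g_n, n \ge 2)$ for the tree growth process were given in Example 5.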

Example 12 (Restricted exchangeable models [5]) Let us define restricted exchangeable dislocation measures by their integral representation, referring to [5] for a full discussion:

$$\kappa = c_1 \delta_{e^{(1)}} + \sum_{j \ge 1} \Big( c_j \delta_{e^{(j+1)}} + k_j \delta_{u^{(j)}} + \int_{\mathcal{S}^\downarrow} \kappa_s\big(\,\cdot \cap \mathcal{P}^{([j],\{j+1\})}\big)\, \nu_j(ds) \Big),$$

where $c_j \ge 0$, $k_j \ge 0$, $u^{(j)} = ([j], \{j+1\}, \{j+2\}, \ldots)$, and $\nu_j$ is a measure on $\mathcal{S}^\downarrow$ satisfying

$$\nu_j(\{(1, 0, \ldots), (0, 0, \ldots)\}) = 0, \qquad \int_{\mathcal{S}^\downarrow} \Big( s_0 \mathbf{1}_{\{j=1\}} + \sum_{i \ge 1} s_i^j (1 - s_i) \Big)\, \nu_j(ds) < \infty, \quad j \ge 1.$$

This includes all exchangeable dislocation measures, for $c_j = c$, $k_j = \nu(\{(0, \ldots)\})$, $\nu_j = \nu - k_j \delta_{(0, \ldots)}$. The splitting rules $(p_n, n \ge 2)$ associated with restricted exchangeable dislocation measures $\kappa$ give rise to the consistent restricted exchangeable labelled Markov branching trees of [5].

Example 13 (Alpha-gamma model) The $\kappa$-measures are restricted exchangeable with $c_j = k_j = 0$, and $\nu_1(ds) = (1-\alpha)\, \mathrm{PD}^*_{\alpha,-\alpha-\gamma}(ds)$, $\nu_j(ds) = \gamma\, \mathrm{PD}^*_{\alpha,-\alpha-\gamma}(ds)$, $j \ge 2$, $0 < \gamma < \alpha < 1$, see [5]. For the associated regenerative growth rules $(g_n, n \ge 2)$, see Example 4.

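Example 14 (Alpha-theta model) This model is not restricted exchangeable and the dislocation measure is not in previous work. To describe it, we introduce an ordered paintbox $\kappa_{(u,1-u)}$, $0 < u < 1$, as the distribution of $\Pi = (\{i \ge 1 : R_i = 1\}, \{i \ge 1 : R_i = 2\})$ where $R_1 = 1$ and the $R_i$, $i \ge 2$, are independent random variables with $\mathbb{P}(R_i = 1) = u = 1 - \mathbb{P}(R_i = 2)$. For $0 < \alpha < 1$ and $\theta > 0$, the $\kappa$-measure of the alpha-theta model is now given by

$$\kappa = \alpha\, \kappa^{\rm bet}_{\theta,-\alpha}\big(\,\cdot \cap \mathcal{P}^{[2]}\big) + \theta\, \kappa^{\rm bet}_{\theta,-\alpha}\big(\,\cdot \cap \mathcal{P}^{(\{1\},\{2\})}\big), \qquad \text{where } \kappa^{\rm bet}_{\theta,-\alpha} = \int_0^1 \kappa_{(u,1-u)}(\,\cdot\,)\, u^{\theta-1} (1-u)^{-\alpha-1}\, du.$$

To see this, let $(g_n, n \ge 2)$ be as in Example 3 and recall the convention $g_1(0) = 1$. From (2),

$$p_n(B_1, B_2) = \big(\alpha \mathbf{1}_{\{2 \in B_1\}} + \theta \mathbf{1}_{\{2 \in B_2\}}\big) \frac{\Gamma(\#B_1 - 1 + \theta)\, \Gamma(\#B_2 - \alpha)}{\Gamma(n - 1 + \theta)\, \Gamma(1 - \alpha)},$$

for $(B_1, B_2) \in \mathcal{P}_n$ with $\#B_1 \ge 1$ and $\#B_2 \ge 1$, $n \ge 2$. The result now follows from (6) and the fact that $\kappa^{\rm bet}_{\theta,-\alpha}(\mathcal{P}^{(B_1,B_2)}) = \int_0^1 u^{\theta-1+\#B_1-1} (1-u)^{-\alpha-1+\#B_2}\, du$ is a beta integral.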
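The link between the closed form for $p_n(B_1, B_2)$ and the beta integral can be checked numerically via (6). The sketch below is an illustration with hypothetical function names; it groups binary partitions of $[n]$ by $(\#B_1, \#B_2)$ and by whether $2 \in B_1$, with the number of partitions in each group given by a binomial coefficient.

```python
import math


def closed_form(b1, b2, two_in_first, alpha, theta):
    """p_n(B_1, B_2) from the displayed Gamma-function formula."""
    weight = alpha if two_in_first else theta
    return (weight * math.gamma(b1 - 1 + theta) * math.gamma(b2 - alpha)
            / (math.gamma(b1 + b2 - 1 + theta) * math.gamma(1 - alpha)))


def kappa_cylinder(b1, b2, two_in_first, alpha, theta):
    """kappa(P^(B_1,B_2)): weight alpha or theta times the beta integral B(a, b)."""
    a, b = theta + b1 - 1, b2 - alpha
    weight = alpha if two_in_first else theta
    return weight * math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def binary_shapes(n):
    """Yield (#B_1, #B_2, 2 in B_1?, number of such partitions of [n])."""
    for b1 in range(1, n):
        b2 = n - b1
        if b1 >= 2:
            yield b1, b2, True, math.comb(n - 2, b1 - 2)   # labels 1, 2 in B_1
        yield b1, b2, False, math.comb(n - 2, b1 - 1)      # label 2 in B_2
```

Summing `kappa_cylinder` over all binary partitions of $[n]$ gives $\lambda_n$, and dividing each cylinder mass by this sum reproduces `closed_form`, which is the content of (6) in this example.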

Coming back to the heuristics at the beginning of this section, let us discuss how Bertoin's [3] notion of a $\mathcal{P}$-valued homogeneous fragmentation process finds a natural extension where his exchangeable dislocation measure is replaced by a dislocation measure in the sense defined above.

Definition 15 A $\mathcal{P}$-valued process $\Pi = (\Pi(t), t \ge 0)$ is called refining if for all $s \le t$ and all blocks $\Pi_j(t)$ of $\Pi(t)$, there is a block $\Pi_i(s)$ of $\Pi(s)$ that contains $\Pi_j(t)$. For a refining process $\Pi$, we define genealogical trees $T_n \in \mathbb{T}_n$, $n \ge 1$, using the representation above (3): $T_n$ has as branch points and leaves all blocks $\Pi_i^{[n]}(t)$, $i \ge 1$, $t \ge 0$, visited by the restriction $\Pi^{[n]}$ of $\Pi$ to $[n]$.

Theorem 16 For each dislocation measure $\kappa$ as defined after Proposition 8, there exists a $\mathcal{P}$-valued Feller process $\Pi = (\Pi(t), t \ge 0)$ such that the genealogical trees $T_n$ of the restrictions $\Pi^{[n]}$ of $\Pi$ to $[n]$, $n \ge 1$, form a regenerative tree growth process associated with dislocation measure $\kappa$.

Proof. We will use $\kappa$ in a Poissonian construction based on independent $\mathcal{P}$-valued Poisson point processes $(\Xi^{(i)}(t), t \ge 0)$, $i \ge 1$, with intensity measure $\kappa$. Roughly, we construct $\Pi$ with $\Pi(0) = \mathbf{1}_{\mathbb{N}}$ such that for all $i$ and $t$ the partition $\Xi^{(i)}(t)$ fragments the $i$th block $\Pi_i(t-)$ of $\Pi(t-)$ into the image $\widetilde{\Xi}^{(i)}(t)$ of $\Xi^{(i)}(t)$ under the increasing bijection from $\mathbb{N}$ to $\Pi_i(t-)$.

More precisely, we build consistent $\mathcal{P}_n$-valued continuous-time Markov chains $(\Pi^{[n]}(t), t \ge 0)$, $n \ge 1$, with jump times $S^{[n]}(k) \ge 0$ and jump states $M^{[n]}(k) = (M_1^{[n]}(k), \ldots, M^{[n]}_{K^{[n]}(k)}(k)) \in \mathcal{P}_n$: we set $\Pi^{[n]}(t) = M^{[n]}(k)$, $S^{[n]}(k) \le t < S^{[n]}(k+1)$, $k \ge 0$, where $S^{[n]}(0) = 0$, $M^{[n]}(0) = \mathbf{1}_{[n]}$,

$$S^{[n]}(k+1) = \inf\Big\{ t > S^{[n]}(k) : \mathcal{P}^{[\#M_i^{[n]}(k)]} \not\ni \Xi^{(i)}(t) \text{ for some } 1 \le i \le K^{[n]}(k) \Big\}$$

and, if $S^{[n]}(k+1) < \infty$, let $M^{[n]}(k+1)$ be the partition obtained from $M^{[n]}(k)$ by replacing the $i$th block by the blocks of $\widetilde{\Xi}^{(i)}(S^{[n]}(k+1))$, the image of $\Xi^{(i)}(S^{[n]}(k+1)) \cap [\#M_i^{[n]}(k)]$ under the increasing bijection from $[\#M_i^{[n]}(k)]$ to $M_i^{[n]}(k)$. Note that $S^{[n]}(k+1) = \infty$ if and only if $M^{[n]}(k) = \mathbf{0}_{[n]} := (\{1\}, \ldots, \{n\})$, as we require $\lambda_2 = \kappa(\mathcal{P}^{(\{1\},\{2\})}) > 0$ for all dislocation measures.

Since $\Pi$ is uniquely determined by $(\Pi^{[n]}, n \ge 1)$, standard properties of Poisson point processes, and of the space $\mathcal{P}$ complete the proof. □

By Proposition 9, growth rules $(g_n, n \ge 2)$ determine a measure $\kappa$ only up to a multiplicative factor $\lambda_2 > 0$. This is reflected in the homogeneous fragmentation processes $\Pi$ of Theorem 16 in the fact that the genealogical trees $T_n$, $n \ge 1$, are unaffected by (linear) time changes of $\Pi$.

From the consistency of $(T_n, n \ge 1)$, it is clear that there is a unique branch point of $T_n$, where 1 and 2 are separated into different blocks. Moreover, as $n$ varies, the partitions at this branch point define a partition of some random subset of $\mathbb{N}$, whose distribution when relabelled by the increasing bijection is described by the splitting rules conditioned on partitions that restrict to $(\{1\}, \{2\})$, hence by $\kappa(\,\cdot \mid \mathcal{P}^{(\{1\},\{2\})})$. In the Poissonian construction, this partition after relabelling is $\Xi^{(1)}(S^{[2]}(1))$. More generally, while there may be Poisson points that do not induce branch points of $T_n$, $n \ge 1$, e.g. when $\kappa$ is finite or when $\kappa$ can produce blocks of finite size, those points $\Xi^{(i)}(S^{[n]}(k))$ used in the Poissonian construction describe the partition at a branch point of $T_m$ for all $m \ge n$. The partition at every branch point, separating labels $j$ and $\ell$ say, has a distribution that is absolutely continuous with respect to $\kappa$.

The Poissonian construction formulated here differs from Bertoin's [3, Section 3.1.3] in the relabelling by increasing bijections: Bertoin uses $\Xi^{(i)}(t) \cap \Pi_i(t-)$ instead of $\widetilde{\Xi}^{(i)}(t)$. In the exchangeable case, this yields the same processes, in distribution. A notable consequence is that under assumptions that ensure that there are always infinitely many blocks and that they are all infinite, we can recover the $(\Xi^{(i)}, i \ge 1)$ from $\Pi$ in our setting. It is now possible to extend substantial parts of Bertoin's theory to this extended generality, but we leave the details to the reader.

3 Scaling limits for regenerative tree growth processes 

Regenerative tree growth processes $(T_n, n \ge 1)$ give rise to evolutions of delabelled trees $T_n^\circ$, $n \ge 1$. Denote by $Q_n^\circ$ the distribution of $T_n^\circ$ on the set $\mathbb{T}_n^\circ$ of rooted unlabelled trees with $n$ leaves, i.e. the push-forward of the distribution of $T_n$ under the natural projection map from $\mathbb{T}_n$ to $\mathbb{T}_n^\circ$. It is easy to verify that the family $(T_n^\circ, n \ge 1)$ has the Markov branching property, in the sense that conditionally given a composition $(n_1, \ldots, n_k)$, $n_1 \ge \cdots \ge n_k \ge 1$, at the first split, the distribution of $T_n^\circ$ is the same as if we connected independent trees $T_{n_i}^\circ$ with distribution $Q_{n_i}^\circ$, $1 \le i \le k$, at their roots, cf. [23, Proposition 11]. In terms of a dislocation measure $\kappa$, the probability of a composition $(n_1, \ldots, n_k)$ is given by

$$q_n(n_1, \ldots, n_k) = \sum_{\pi \in \mathcal{P}_n :\, (\#\pi)^\downarrow = (n_1, \ldots, n_k)} p_n(\pi) = \frac{1}{\lambda_n} \sum_{\pi \in \mathcal{P}_n :\, (\#\pi)^\downarrow = (n_1, \ldots, n_k)} \kappa(\mathcal{P}^\pi), \qquad \lambda_n = \kappa(\mathcal{P} \setminus \mathcal{P}^{[n]}).$$

Equipped with the graph metric and with the uniform probability measure ("weight measure") on the leaves, $T_n^\circ$ is a weighted metric space. We use notation $T_n^\circ / a$ to scale the metric to obtain a metric space with distance $1/a$ between any two adjacent vertices, where $a \in (0, \infty)$. Haas and Miermont [14] studied convergence of $T_n^\circ / (n^\gamma \ell(n))$ for $\gamma \in (0,1)$ and $\ell$ slowly varying; in the sequel every $\ell$ will be assumed slowly varying. The sense of convergence is rooted Gromov-Hausdorff-Prohorov (GHP) convergence on the space of isometry classes of pointed weighted compact metric spaces (see e.g. [SJ El Q21 Ej). The limiting random trees are rooted $\gamma$-self-similar trees $(\mathcal{T}, d, \rho)$ equipped with a measure $\mu$ on the Borel sets of $\mathcal{T}$, which have the property that for all $t \ge 0$ conditionally given the tree $\{v \in \mathcal{T} : d(\rho, v) \le t\}$ up to height $t$ and given subtree masses $\mu(\mathcal{S}_i(t)) = m_i(t)$, the subtrees $\mathcal{S}_i(t)$, $i \ge 1$, above height $t$, are like independent copies of $\mathcal{T}$, with masses rescaled by $m_i(t)$ and distances rescaled by $(m_i(t))^\gamma$.

Haas and Miermont [12] showed that the process $F(t) = (\mu(\mathcal{S}_i(t)), i \ge 1)^\downarrow$, $t \ge 0$, in $\mathcal{S}^\downarrow$ is a $\gamma$-self-similar fragmentation process in the sense of Bertoin [3], and hence is associated with a dislocation measure $\nu$ on $\mathcal{S}^\downarrow$ that describes the infinitesimal behaviour of $F$. Specifically, a unit mass $F(0) = (1, 0, \ldots)$ undergoes repeated fracturing into smaller masses that evolve independently, with a fragment of size $m$ splitting into $ms = (ms_1, ms_2, \ldots)$ at rate $m^{-\gamma} \nu(ds)$. It was shown in [12] that such self-similar trees exist if $\nu$ satisfies, for $\mathcal{S}_1^\downarrow = \{s \in \mathcal{S}^\downarrow : \sum_{i \ge 1} s_i = 1\}$, that

$$\nu(\{(1, 0, \ldots)\}) = 0, \qquad \int_{\mathcal{S}^\downarrow} (1 - s_1)\, \nu(ds) < \infty, \qquad \text{and} \qquad \nu\big(\mathcal{S}^\downarrow \setminus \mathcal{S}_1^\downarrow\big) = 0. \tag{9}$$

The first two conditions, (8), are needed [3] to construct $F$, whereas the third condition prevents loss of mass at splitting events. The latter could in principle be captured in objects similar to Haas and Miermont's self-similar trees by allowing the measure $\mu$ to have atoms, which they disallow, but we will here follow them and exclude this case by imposing the third condition.

While every dislocation measure $\kappa$ on $\mathcal{P}$ gives rise to a regenerative tree growth process, not every such process has a scaling limit. Examples without scaling limit include the $(0, \theta)$-tree growth process studied in [23, Proposition 13], where the growth is logarithmic and the branching structure degenerates under logarithmic scaling. Given a dislocation measure $\kappa$ with $\lambda_n = \kappa(\mathcal{P} \setminus \mathcal{P}^{[n]}) = n^\gamma \ell(n)$ regularly varying, as $n \to \infty$, for some $\gamma \in (0,1)$, Haas and Miermont's [14] convergence condition (H) requires convergence as $n \to \infty$ of

$$n^\gamma \ell(n) \sum_{n_1 \ge \cdots \ge n_k \ge 1 :\, n_1 + \cdots + n_k = n} q_n(n_1, \ldots, n_k) \Big(1 - \frac{n_1}{n}\Big) f\Big(\frac{n_1}{n}, \ldots, \frac{n_k}{n}, 0, \ldots\Big)$$
$$= \sum_{\pi \in \mathcal{P}_n \setminus \{\mathbf{1}_{[n]}\}} \kappa(\mathcal{P}^\pi) \Big(1 - \frac{(\#\pi)_1^\downarrow}{n}\Big) f\Big(\frac{(\#\pi)^\downarrow}{n}\Big) = \int_{\mathcal{P}} \big(1 - |\Gamma^{[n]}|_1^\downarrow\big)\, f\big(|\Gamma^{[n]}|^\downarrow\big)\, \kappa(d\Gamma), \tag{10}$$

where we write $(\#\pi)^\downarrow$ for the decreasing rearrangement of the block sizes of $\pi$, with zeros appended. In view of (10), it is natural to assume the existence of asymptotic ranked frequencies $\kappa$-a.e. This holds for exchangeable and restricted exchangeable $\kappa$, and when $\kappa$ is partially exchangeable in the sense of [21]. In our framework, condition (H) only needs checking for $f \equiv 1$:

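Theorem 17 For a dislocation measure $\kappa$ on $\mathcal{P}$ such that $\kappa$-a.e. $\Gamma \in \mathcal{P} \setminus \{\mathbf{1}_{\mathbb{N}}\}$ has asymptotic ranked frequencies in $\mathcal{S}_1^\downarrow \setminus \{(1, 0, \ldots)\}$, define $\nu$ to be the push-forward of $\kappa$ under $\Gamma \mapsto |\Gamma|^\downarrow$ and suppose $\int_{\mathcal{S}^\downarrow} (1 - s_1)\, \nu(ds) < \infty$ and $\lambda_n = \kappa(\mathcal{P} \setminus \mathcal{P}^{[n]}) = n^\gamma \ell(n)$. Then the following are equivalent:

(i) Condition (H) holds, i.e. for all bounded continuous $f : \mathcal{S}^\downarrow \to [0, \infty)$,

$$\int_{\mathcal{P}} \big(1 - |\Gamma^{[n]}|_1^\downarrow\big)\, f\big(|\Gamma^{[n]}|^\downarrow\big)\, \kappa(d\Gamma) \to \int_{\mathcal{S}^\downarrow} (1 - s_1)\, f(s)\, \nu(ds), \quad \text{as } n \to \infty;$$

(ii) $\int_{\mathcal{P}} \big(|\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow\big)\, \kappa(d\Gamma) \to 0$, as $n \to \infty$;

(iii) $\int_{\mathcal{P}} \big|\, |\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow \big|\, \kappa(d\Gamma) \to 0$, as $n \to \infty$, i.e. the convergence $|\Gamma^{[n]}|_1^\downarrow \to |\Gamma|_1^\downarrow$ holds in $L^1(\kappa)$.

If condition (ii) holds, then

$$\frac{T_n^\circ}{n^\gamma \ell(n)} \to \mathcal{T}_{\gamma,\nu} \quad \text{in distribution, as } n \to \infty, \text{ in the GHP sense.}$$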



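Proof. If (i) holds, we obtain (ii) as a rearrangement of the special case $f \equiv 1$. Now assume (ii). Let us prove (iii). For all $m \ge 1$, consider $\mathcal{P}^{[m]}$, all partitions that do not split $[m]$, then

$$\int_{\mathcal{P}} \big|\, |\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow \big|\, \kappa(d\Gamma) = \int_{\mathcal{P}^{[m]}} \big|\, |\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow \big|\, \kappa(d\Gamma) + \int_{\mathcal{P} \setminus \mathcal{P}^{[m]}} \big|\, |\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow \big|\, \kappa(d\Gamma),$$

where for each fixed $m$, the second term vanishes as $n \to \infty$, by dominated convergence, since $\kappa(\mathcal{P} \setminus \mathcal{P}^{[m]}) < \infty$ and since $\kappa$-a.e. $\Gamma$ has asymptotic ranked frequencies. The first term is bounded by

$$\int_{\mathcal{P}} \big(1 - |\Gamma^{[n]}|_1^\downarrow\big)\, \kappa(d\Gamma) - \int_{\mathcal{P} \setminus \mathcal{P}^{[m]}} \big(1 - |\Gamma^{[n]}|_1^\downarrow\big)\, \kappa(d\Gamma) + \int_{\mathcal{P}^{[m]}} \big(1 - |\Gamma|_1^\downarrow\big)\, \kappa(d\Gamma) \to 2 \int_{\mathcal{P}^{[m]}} \big(1 - |\Gamma|_1^\downarrow\big)\, \kappa(d\Gamma),$$

as $n \to \infty$, where the second term converges by the same reasoning as before, and the first by condition (ii). Since $\int_{\mathcal{P}} (1 - |\Gamma|_1^\downarrow)\, \kappa(d\Gamma) < \infty$, while $\bigcap_{m \ge 1} \mathcal{P}^{[m]} = \{\mathbf{1}_{\mathbb{N}}\}$ and $\kappa(\{\mathbf{1}_{\mathbb{N}}\}) = 0$, the infimum of these limits over $m \ge 1$ vanishes by basic measure theory, and (iii) follows.
Now assume (iii). If $f : \mathcal{S}^\downarrow \to [0, \infty)$ is continuous and bounded, then

$$\Big| \int_{\mathcal{P}} \big(1 - |\Gamma^{[n]}|_1^\downarrow\big)\, f\big(|\Gamma^{[n]}|^\downarrow\big)\, \kappa(d\Gamma) - \int_{\mathcal{P}} \big(1 - |\Gamma|_1^\downarrow\big)\, f\big(|\Gamma|^\downarrow\big)\, \kappa(d\Gamma) \Big|$$
$$\le \int_{\mathcal{P}} \big|\, |\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow \big|\, f\big(|\Gamma^{[n]}|^\downarrow\big)\, \kappa(d\Gamma) + \int_{\mathcal{P}} \big(1 - |\Gamma|_1^\downarrow\big)\, \big| f\big(|\Gamma^{[n]}|^\downarrow\big) - f\big(|\Gamma|^\downarrow\big) \big|\, \kappa(d\Gamma),$$

and (i) follows from (iii) by dominated convergence, since $\kappa$-a.e. $\Gamma$ has asymptotic ranked frequencies, since $f$ is bounded and continuous, and since $\nu$ is the push-forward of $\kappa$ under $\Gamma \mapsto |\Gamma|^\downarrow$. The last part follows from Haas and Miermont [14, Theorem 1]. □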

Example 18 (Exchangeable models) For exchangeable $\kappa = \int_{\mathcal{S}_1^\downarrow} \kappa_s(\,\cdot\,)\, \nu(ds)$, it was demonstrated in [14] that (H) reduces to the $f \equiv 1$ case and is easily checked when (9) holds and $\lambda_n = n^\gamma \ell(n)$. Theorem 17 applies. This includes the Poisson-Dirichlet models of Example 5.

Example 19 (Alpha-theta model) [14] proved that their conditions hold. However, here is a shorter argument: in the notation of Example 14 we have (as [2] for exchangeable paintboxes)

$$\int_{\mathcal{P}} \big(1 - |\Gamma^{[n]}|_1^\downarrow\big)\, \kappa_{(u,1-u)}(d\Gamma) \le \int_{\mathcal{P}} \big(1 - |\Gamma_1^{[n]}|\big)\, \kappa_{(u,1-u)}(d\Gamma) = 1 - s_1 \quad \text{if } s_1 = u \ge 1/2,$$

and the same holds with $s_1 = 1 - u$ if $u < 1/2$. But then (H) for $f \equiv 1$ follows for $\kappa^{\rm bet}_{\theta,-\alpha}$ and for $\kappa$ which is bounded by a multiple of $\kappa^{\rm bet}_{\theta,-\alpha}$ by dominated convergence. Theorem 17 applies.

Example 20 (Restricted exchangeable models) Consider $\kappa$ as in Example 12 with $c_j = k_j = 0$, and with $\lambda_n = \kappa(\mathcal{P} \setminus \mathcal{P}^{[n]}) = n^\gamma \ell(n)$. The push-forward of $\kappa$ under $\Gamma \mapsto |\Gamma|^\downarrow$ is given by

$$\nu(ds) = \sum_{j \ge 1} \Big( \sum_{i \ge 1} s_i^j (1 - s_i) \Big)\, \nu_j(ds).$$

Assuming (9), and also $\nu_j = \nu_m$ for all $j \ge m$ for some $m \ge 1$, as in [5, Theorem 7] where scaling limits were established for convergence in probability, we deduce condition (ii) of Theorem 17 for

$$\kappa = \kappa_{\nu_m} - \kappa_{\nu_m}\big(\,\cdot \cap (\mathcal{P} \setminus \mathcal{P}^{[m]})\big) + \sum_{j=1}^{m-1} \int_{\mathcal{S}_1^\downarrow} \kappa_s\big(\,\cdot \cap \mathcal{P}^{([j],\{j+1\})}\big)\, \nu_j(ds)$$

from the exchangeable case and by dominated convergence, because on the RHS only the measure $\kappa_{\nu_m} = \int_{\mathcal{S}_1^\downarrow} \kappa_s(\,\cdot\,)\, \nu_m(ds)$ is infinite. Theorem 17 applies. This includes the alpha-gamma model.

All these examples satisfy the condition (ii), which we need to get convergence from Theorem 17. Let us provide another, very different example that shows what can go wrong.

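Example 21 Some of the most elementary non-trivial dislocation measures are of the form

$$\kappa = \sum_{j \ge 2} (\lambda_j - \lambda_{j-1})\, \delta_{\Gamma(j)} \qquad \text{for some } \Gamma(j) \in \mathcal{P}^{([j-1],\{j\})}.$$

To ensure $\lambda_n \sim n^\gamma$ for some $\gamma \in (0,1)$, let $\lambda_n - \lambda_{n-1} = \gamma n^{\gamma-1}$. For simplicity, we take $\Gamma(j)$ binary with asymptotic frequencies $(x^{(j)}, 1 - x^{(j)})$, where $x^{(j)} = 1 - 1/j$. This implies

$$\int_{\mathcal{S}^\downarrow} (1 - s_1)\, \nu(ds) = \int_{\mathcal{P}} \big(1 - |\Gamma|_1^\downarrow\big)\, \kappa(d\Gamma) = \sum_{j \ge 2} \big(1 - x^{(j)}\big) (\lambda_j - \lambda_{j-1}) = \gamma \sum_{j \ge 2} j^{\gamma-2} < \infty,$$

with $\nu$ as push-forward of $\kappa$. Consider a $\gamma$-self-similar tree $\mathcal{T}_{\gamma,\nu}$ with dislocation measure $\nu$. We explore two examples illustrating the validity/violation of condition (ii), which now reads

$$\int_{\mathcal{P}} \big(|\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow\big)\, \kappa(d\Gamma) = \sum_{j \ge 2} \big(|\Gamma(j)^{[n]}|_1^\downarrow - x^{(j)}\big) (\lambda_j - \lambda_{j-1}) \to 0, \qquad \text{as } n \to \infty.$$

(a) For $j \ge 2$ and $x^{(j)} = 1 - 1/j \in (0,1)$, we construct $\Gamma(j)$ as a sequence $(\Gamma(j)^{[n]}, n \ge j)$, starting from $\Gamma(j)^{[j]} = ([j-1], \{j\})$, and using the following step inductively for $n \ge j$:

• Step $A_x$: Given $\Gamma^{[n]}$, if $|\Gamma_1^{[n]}| \ge x$, set $\Gamma_1^{[n+1]} = \Gamma_1^{[n]}$, otherwise set $\Gamma_1^{[n+1]} = \Gamma_1^{[n]} \cup \{n+1\}$.

The purpose of Step $A_x$ is to change the relative frequency towards $x$. For $x = x^{(j)}$ and $\Gamma = \Gamma(j)$, we get $|\Gamma(j)_1| = |\Gamma(j)|_1^\downarrow = x^{(j)}$ and $\big|\, |\Gamma(j)^{[n]}|_1^\downarrow - x^{(j)} \big| \le 1 - x^{(j)}$ for all $n \ge 1$, $j \ge 2$, equality for $j > n$ and strict inequality for $j \le n$, since $1/n \le 1/j = 1 - x^{(j)}$. By the Dominated Convergence Theorem, criterion (ii) of Theorem 17 is satisfied.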
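Step $A_x$ in part (a) is a simple deterministic recursion and can be traced exactly. The sketch below (with the illustrative function name `first_block_sizes`) records $\#\Gamma(j)_1^{[n]}$ for $n \ge j$ and lets one check that the relative frequency of the first block stays within $1/n$ of $x^{(j)} = 1 - 1/j$, which is the bound behind the domination argument.

```python
from fractions import Fraction


def first_block_sizes(j, n_max):
    """Sizes of the first block of Gamma(j) restricted to [n], for j <= n <= n_max."""
    x = Fraction(j - 1, j)                 # target frequency x^(j) = 1 - 1/j
    size = j - 1                           # Gamma(j)^[j] = ([j - 1], {j})
    sizes = {j: size}
    for n in range(j, n_max):
        if Fraction(size, n) < x:          # Step A_x: below target, n + 1 joins block 1
            size += 1
        sizes[n + 1] = size                # otherwise n + 1 joins block 2
    return sizes
```

For $j = 2$ the first block sizes for $n = 2, 3, 4$ are $1, 1, 2$, so the frequency moves $1/2, 1/3, 1/2$ and keeps oscillating around $1/2$ with amplitude at most $1/n$.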

(b) For the convergence criterion (ii) to fail, first let $\Gamma(j)^{[n]}$ approach frequency 1/2, applying
Step $A_{1/2}$ for $n \le a_j$, so that $|\Gamma(j)_1^{[2j]}| = 1/2$ and $|\Gamma(j)_1^{[n]}| \approx 1/2$ for $n \in [2j, a_j]$. Choose
$(a_j)$ increasing with $2 > \sum_{j \ge 2 \colon n \in [2j, a_j]} (\lambda_j - \lambda_{j-1}) > 1$ and apply Step $A_{x^{(j)}}$ for $n > a_j$. Then
we will have $|\Gamma(j)_1| = x^{(j)}$ for all $j \ge 2$, but for all $n$ sufficiently large,

£ (|r(j)! nl | - x«) (A, - Aj_i) < -i £ (A, - A,-!) < -i < 0. 

j>2 i>2:ne[2i,aj] ° 

Intuitively, the approximating trees have too many even branchpoints splitting into two 
equal-sized subtrees making trees wide and small in height, while the proposed limiting 
distribution produces uneven branch points leading to thin and high trees with higher 
probability. Gromov-Hausdorff convergence fails, if total heights do not converge [8]. 
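
As a rough illustration of Step $A_x$ in parts (a) and (b), the following Python sketch tracks only the size of the block containing 1 in a binary partition. It assumes that the threshold test reads "relative frequency at least $x$" and that, in (b), Step $A_{1/2}$ is used for the transitions from $[n]$ to $[n+1]$ with $n < a_j$. The function names and the hand-picked cut-off `a` are illustrative choices and do not come from the source.

```python
from fractions import Fraction


def step(size1, n, x):
    """Apply Step A_x once to a binary partition of [n].

    size1 is the size of the block containing 1. The return value is the
    size of that block after label n+1 has been placed.
    Assumption: the test is 'relative frequency >= x'.
    """
    if Fraction(size1, n) >= x:
        return size1        # label n+1 joins the other block
    return size1 + 1        # label n+1 joins the block containing 1


def block_sizes(j, n_max, a=None):
    """Sizes of the block containing 1 in Gamma(j)^{[n]}, n = j, ..., n_max.

    a=None follows part (a): Step A_x with x = 1 - 1/j throughout.
    An integer a follows part (b): Step A_{1/2} while n < a, then Step A_x.
    """
    x = 1 - Fraction(1, j)
    sizes = {j: j - 1}      # Gamma(j)^{[j]} = ([j-1], {j})
    for n in range(j, n_max):
        target = Fraction(1, 2) if (a is not None and n < a) else x
        sizes[n + 1] = step(sizes[n], n, target)
    return x, sizes


if __name__ == "__main__":
    x, good = block_sizes(5, 200)
    # Largest deviation from x, measured in units of 1/n.
    print(max(abs(Fraction(s, n) - x) * n for n, s in good.items()))
    _, delayed = block_sizes(5, 200, a=40)
    print([Fraction(delayed[n], n) for n in (10, 40, 200)])
```

In the run of part (a) the relative frequency never moves further than $1/n$ from $x^{(j)}$, which is the bound used for dominated convergence. In the run of part (b) the frequency sits near 1/2 up to the cut-off and only afterwards drifts back to $x^{(j)}$.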
4 Residual mass processes in regenerative tree growth processes



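Let $(T_n, n \ge 1)$ be a regenerative tree growth process and $(X_m^{(n)}, 0 \le m \le M_n)$ the associated
residual mass processes of label 1 in $T_n$, $n \ge 1$, with transition probabilities

$$\mathbb{P}(X_1^{(n)} = k) = \sum_{\pi = (B_1, \ldots, B_\ell) \in \mathcal{P}_n \colon \#B_1 = k} p_n(\pi) = \frac{1}{\lambda_n}\, \kappa\left(\{\Gamma \in \mathcal{P} \colon \#\Gamma_1^{[n]} = k\}\right), \quad 1 \le k < n,$$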

as identified in Proposition 6, with $\lambda_n = \kappa(\{\Gamma \in \mathcal{P} \colon \Gamma^{[n]} \ne ([n])\})$. The existence of a scaling limit
for trees $T_n^\circ$ as studied in Section 3 does not imply the existence of a scaling limit for associated
residual mass processes $X^{(n)}$, in general (see Example 24). In this section, we study scaling
limits $X^{(n)}_{\lfloor \lambda_n t \rfloor}/n \to X_t$, as $n \to \infty$. Recall that positive decreasing self-similar Markov processes
with $X_0 = 1$ can be represented as $X_t = \exp(-\xi_{\tau_\xi(t)})$ for a subordinator $\xi$ and the $\gamma$-self-similar
time-change $\tau_\xi(t) = \inf\{u \ge 0 \colon \int_0^u \exp(-\gamma \xi_r)\,dr > t\} \in [0, \infty]$, $t \ge 0$. We focus on the drift-free
case when $\mathbb{E}(e^{-s\xi_r}) = \exp\left(-r \int_{(0,\infty)} (1 - e^{-sy})\,\Lambda(dy)\right)$.
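
The time-change representation can be sketched numerically. The Python fragment below is an illustration under a simplifying assumption that is not made in the text: $\Lambda$ is taken to be a finite measure, so that $\xi$ is a compound Poisson process and the integral defining $\tau_\xi(t)$ is piecewise linear. The function names, the `horizon` cut-off and the jump sampler are hypothetical.

```python
import math
import random


def simulate_compound_poisson(rate, jump_sampler, horizon, rng):
    """Jump times and values of a drift-free compound Poisson subordinator
    on [0, horizon]. times[i] is the start of the i-th constancy interval
    and heights[i] the value of xi on that interval."""
    times, heights = [0.0], [0.0]
    t, xi = 0.0, 0.0
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            break
        xi += jump_sampler(rng)
        times.append(t)
        heights.append(xi)
    return times, heights


def residual_mass(times, heights, gamma, t, horizon):
    """X_t = exp(-xi_{tau(t)}), tau(t) = inf{u: int_0^u exp(-gamma*xi_r) dr > t}.

    Returns None when the simulated path on [0, horizon] is too short to
    decide the value of tau(t).
    """
    clock = 0.0
    for i, xi in enumerate(heights):
        end = times[i + 1] if i + 1 < len(times) else horizon
        clock += math.exp(-gamma * xi) * (end - times[i])
        if clock > t:
            return math.exp(-xi)
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed example: Lambda = 2 * Exp(1), i.e. rate 2 and Exp(1) jumps.
    times, heights = simulate_compound_poisson(
        2.0, lambda r: r.expovariate(1.0), 50.0, rng)
    for t in (0.0, 0.25, 0.5, 1.0):
        print(t, residual_mass(times, heights, 0.5, t, 50.0))
```

Because $\xi$ is non-decreasing, the printed values are non-increasing in $t$. A `None` only signals that the finite horizon was exhausted, not that the mass has reached 0.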

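Given a dislocation measure $\kappa$ with $\lambda_n = \kappa(\mathcal{P} \setminus \mathcal{P}^{[n]}) = n^\gamma \ell(n)$ regularly varying, as $n \to \infty$,
for some $\gamma \in (0,1)$, Haas and Miermont's [13] condition (H*) requires convergence as $n \to \infty$ of

$$\sum_{\pi = (B_1, \ldots, B_k) \in \mathcal{P}_n} \lambda_n\, p_n(\pi) \left(1 - \frac{\#B_1}{n}\right) f\left(\frac{\#B_1}{n}\right) = \int_{\mathcal{P}} \left(1 - |\Gamma_1^{[n]}|\right) f\left(|\Gamma_1^{[n]}|\right) \kappa(d\Gamma).$$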

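Theorem 22 For a dislocation measure $\kappa$ such that the first block $\Gamma_1$ of $\kappa$-a.e. $\Gamma \in \mathcal{P}$ has an
asymptotic frequency $|\Gamma_1| \in (0,1)$, define $\Lambda$ as push-forward of $\kappa$ under $\Gamma \mapsto -\log(|\Gamma_1|)$. Suppose
$\int_{(0,\infty)} (1 - e^{-x})\,\Lambda(dx) < \infty$ and $\lambda_n = \kappa(\mathcal{P} \setminus \mathcal{P}^{[n]}) = n^\gamma \ell(n)$. Then the following are equivalent:

(i) Condition (H*) holds, i.e. for all bounded continuous $f \colon [0,1] \to [0,\infty)$,

$$\int_{\mathcal{P}} \left(1 - |\Gamma_1^{[n]}|\right) f\left(|\Gamma_1^{[n]}|\right) \kappa(d\Gamma) \to \int_{(0,\infty)} f(e^{-y})(1 - e^{-y})\,\Lambda(dy), \quad\text{as } n \to \infty;$$

(ii) $\int_{\mathcal{P}} \left(|\Gamma_1^{[n]}| - |\Gamma_1|\right) \kappa(d\Gamma) \to 0$, as $n \to \infty$;

(iii) $\int_{\mathcal{P}} \big| |\Gamma_1^{[n]}| - |\Gamma_1| \big|\, \kappa(d\Gamma) \to 0$, as $n \to \infty$, i.e. the convergence $|\Gamma_1^{[n]}| \to |\Gamma_1|$ holds in $L^1(\kappa)$.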

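If condition (ii) holds, $X^{(n)}_{\lfloor \lambda_n t \rfloor}/n \to X_t = \exp(-\xi_{\tau_\xi(t)})$ in distribution, as $n \to \infty$, in the Skorohod
sense as functions of $t \ge 0$, where $\xi$ is a subordinator with $\mathbb{E}(e^{-s\xi_r}) = \exp\left(-r \int_{(0,\infty)} (1 - e^{-sy})\,\Lambda(dy)\right)$.
If, in addition, $\kappa$-a.e. $\Gamma \in \mathcal{P}$ has asymptotic ranked frequencies, then condition (H) holds and

$$\frac{T_n^\circ}{\lambda_n} \to \mathcal{T}_{\gamma,\nu} \quad\text{in distribution, as } n \to \infty, \text{ in the GHP sense.}$$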

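Proof. The proof of the equivalences is the same as for Theorem 17, with $|\Gamma^{[n]}|_1^\downarrow$ and $|\Gamma|_1^\downarrow$ replaced
by $|\Gamma_1^{[n]}|$ and $|\Gamma_1|$. Convergence of $X^{(n)}_{\lfloor \lambda_n t \rfloor}/n$ is now an application of [13, Theorem 1]. Finally, if
$\kappa$-a.e. $\Gamma \in \mathcal{P}$ has asymptotic ranked frequencies, we first note that in the notation of Theorem 17

$$\int_{\mathcal{S}^\downarrow} (1 - s_1)\,\nu(ds) = \int_{\mathcal{P}} (1 - |\Gamma|_1^\downarrow)\,\kappa(d\Gamma) \le \int_{\mathcal{P}} (1 - |\Gamma_1|)\,\kappa(d\Gamma) = \int_{(0,\infty)} (1 - e^{-x})\,\Lambda(dx) < \infty. \qquad (11)$$

To apply Theorem 17, we verify condition (ii) of Theorem 17.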

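$$\int_{\mathcal{P}} \left(|\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow\right) \kappa(d\Gamma) = \int_{\{|\Gamma_1| = |\Gamma|_1^\downarrow\}} \left(|\Gamma_1^{[n]}| - |\Gamma_1|\right) \kappa(d\Gamma) + \int_{\{|\Gamma_1| \ne |\Gamma|_1^\downarrow\}} \left(|\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow\right) \kappa(d\Gamma)$$

$$\qquad + \int_{\{|\Gamma_1| = |\Gamma|_1^\downarrow\}} \left(|\Gamma^{[n]}|_1^\downarrow - |\Gamma_1^{[n]}|\right) \kappa(d\Gamma) \qquad (12)$$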
is a sum of three terms. The first term vanishes as $n \to \infty$ by (iii). The second term vanishes
as $n \to \infty$ since $\kappa(|\Gamma_1| \ne |\Gamma|_1^\downarrow) < \infty$: if $|\Gamma_1| \ne |\Gamma|_1^\downarrow$, then one of them must be less than 1/2, so
$\kappa(|\Gamma_1| \ne |\Gamma|_1^\downarrow) \le \kappa(|\Gamma_1| < 1/2) + \kappa(|\Gamma|_1^\downarrow < 1/2) < \infty$, by (11). The third term is non-negative, so
that $\liminf_{n \to \infty} \mathrm{LHS} \ge 0$ in (12). In



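$$\int_{\mathcal{P}} \left(|\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow\right) \kappa(d\Gamma) \le \int_{\mathcal{P}^{[m]}} (1 - |\Gamma_1|)\,\kappa(d\Gamma) + \int_{\mathcal{P} \setminus \mathcal{P}^{[m]}} \left(|\Gamma^{[n]}|_1^\downarrow - |\Gamma|_1^\downarrow\right) \kappa(d\Gamma)$$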

we can make the first term small by choosing $m$ large and the second term vanishes as $n \to \infty$,
for each fixed $m \ge 1$. Hence, $\limsup_{n \to \infty} \mathrm{LHS} \le 0$ and so $\lim_{n \to \infty} \mathrm{LHS} = 0$. □



From the proof of Theorem [22] we can also derive a converse to the last part of Theorem [ 
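Corollary 23 In the setting of Theorem 17, the block $\Gamma_1$ containing 1 of $\kappa$-a.e. $\Gamma \in \mathcal{P} \setminus \{\mathbb{N}\}$ has
an asymptotic frequency in (0, 1). With $\Lambda$ as in Theorem 22, the following are equivalent:

(i) $\int_{(0,\infty)} (1 - e^{-x})\,\Lambda(dx) < \infty$ and $\int_{\mathcal{P}} \left(|\Gamma_1^{[n]}| - |\Gamma_1|\right) \kappa(d\Gamma) \to 0$;

(ii) $\kappa(|\Gamma_1| \ne |\Gamma|_1^\downarrow) < \infty$ and $\int_{\{|\Gamma_1| = |\Gamma|_1^\downarrow\}} \left(|\Gamma^{[n]}|_1^\downarrow - |\Gamma_1^{[n]}|\right) \kappa(d\Gamma) \to 0$.

If condition (ii) holds, $X^{(n)}_{\lfloor \lambda_n t \rfloor}/n \to X_t = \exp(-\xi_{\tau_\xi(t)})$ in distribution, as $n \to \infty$, in the Skorohod
sense as functions of $t \ge 0$, where $\xi$ is a subordinator with $\mathbb{E}(e^{-s\xi_r}) = \exp\left(-r \int_{(0,\infty)} (1 - e^{-sy})\,\Lambda(dy)\right)$.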

It is not hard to check that these conditions are satisfied in previous examples, in which tree
convergence and residual mass process convergence are known to hold, including the exchangeable
case [15, Proposition 7], the particle labelled 1 in the restricted exchangeable case [5, Proposition
28] and particle 1 in the alpha-theta model [23, Proposition 6(iv)]. Let us provide an example
that satisfies tree convergence in distribution and the first extra condition $\kappa(|\Gamma_1| \ne |\Gamma|_1^\downarrow) < \infty$,
but not the second extra condition in (ii), and hence also fails the second extra condition in (i).

Example 24 In the general setting of Example 21, consider $\Gamma(j)^{[n]}$ that first approaches (the
wrong!) frequency $1 - x^{(j)}$, applying Step $A_{1 - x^{(j)}}$ for $n \le a_j$, so that $|\Gamma(j)_1^{[n]}| \approx 1 - x^{(j)}$ for
$n \in [j/(1 - x^{(j)}), a_j]$. Then we apply Step $A_{x^{(j)}}$ for $n > a_j$ to achieve $|\Gamma(j)_1| = x^{(j)}$. We call
these partitions "evil". If we did this for all $j \ge 2$, too many partitions would have intermediate
frequencies around 1/2 when restricted to $[n]$ and tree convergence may fail. Note that while the
block containing 1 is at frequency $1 - x^{(j)}$, the block not containing 1 has frequency $x^{(j)}$; this is the
larger block size that appears in the tree convergence criterion, whereas frequency $1 - x^{(j)}$ is
relevant for the residual mass process.

To control the influence of partitions at intermediate frequencies, we also consider "good"
partitions from Example 21(a). The following strategy gives the right mix of "good" and "evil":

1. For $j = 2$ and $j = 3$, start with two evil partitions $\Gamma(2)$ and $\Gamma(3)$, with $|\Gamma(3)_1^{[\ell]}| \approx 1 - x^{(3)}$
for $\ell = 3/(1 - x^{(3)})$, but leave $a_2$ and $a_3$ to be specified. Take good partitions $\Gamma(4), \ldots, \Gamma(\ell)$.
Also recall from the general setting that $\lambda_3 - \lambda_2 = \gamma 3^{\gamma-1} > 0$. To proceed inductively, let
$m = 1$, $E_1 = \{2, 3\}$, $j_1 = \ell + 1$, and proceed to step 2.

2. Given $(m, E_m, j_m)$, release the smallest evil partition $e_m = \min E_m$ by setting $a_{e_m} = j_m$.
Start evil partitions $\Gamma(j_m), \ldots, \Gamma(k_m)$ up to $k_m = \inf\{j \ge j_m \colon \lambda_j - \lambda_{j_m} > \lambda_{e_m} - \lambda_{e_m - 1}\}$. Let

$$\ell_m = \inf\left\{j > k_m \colon |\Gamma(e_m)_1^{[j]}| \approx x^{(e_m)} \text{ and } |\Gamma(i)_1^{[j]}| \approx 1 - x^{(i)},\ j_m \le i \le k_m\right\},$$

and take good partitions $\Gamma(k_m + 1), \ldots, \Gamma(\ell_m)$. Now set $E_{m+1} = (E_m \setminus \{e_m\}) \cup \{j_m, \ldots, k_m\}$,
$j_{m+1} = \ell_m + 1$ and repeat step 2 for $(m + 1, E_{m+1}, j_{m+1})$.
Now $|\Gamma(j)_1| = x^{(j)}$ for all $j \ge 2$ since $a_{e_m} < \infty$ for all evil partitions $e_m$. Criterion (ii) of Theorem
17 for tree convergence holds, because the good partitions and the evil partitions that are either
at frequency $x^{(j)}$ or $1 - x^{(j)}$ give convergence as in Example 21(a), while the evil partitions at
intermediate frequencies have total weight $w_m = (\lambda_{e_m} - \lambda_{e_m - 1}) + (\lambda_{k_m} - \lambda_{j_m - 1}) \to 0$ as $m \to \infty$,
so their contribution vanishes as $m \to \infty$.

Criterion (ii) of Theorem 22 for residual mass process convergence is not satisfied, because
for every $n \ge 3/(1 - x^{(3)})$, there are evil partitions of weight at least $\lambda_3 - \lambda_2$ which have a
frequency $|\Gamma(j)_1^{[n]}| \approx 1 - x^{(j)}$ that is smaller by more than 1/4 than their limit frequency $x^{(j)}$,
since $x^{(j)} - (1 - x^{(j)}) > 1/4$ for all $j \ge 3$, and this cannot be offset by partitions that exceed their
limit frequencies, by the argument in Example 21(a).
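
The behaviour of a single "evil" partition can be traced with a short Python sketch. It follows only the size of the block containing 1, and it fixes the release time `a` by hand, whereas in the example the release times $a_{e_m} = j_m$ come out of the inductive schedule. The function name, the value of `a` and the convention that Step $A_{1 - x^{(j)}}$ is used for transitions with $n < a$ are assumptions made for illustration.

```python
from fractions import Fraction


def evil_block_sizes(j, a, n_max):
    """Sizes of the block containing 1 in an 'evil' Gamma(j)^{[n]}.

    Step A_{1-x} is applied while n < a (assumed convention), Step A_x
    afterwards, with x = 1 - 1/j and start Gamma(j)^{[j]} = ([j-1], {j}).
    """
    x = 1 - Fraction(1, j)
    size = j - 1
    sizes = {j: size}
    for n in range(j, n_max):
        target = (1 - x) if n < a else x
        if Fraction(size, n) < target:
            size += 1           # label n+1 joins the block containing 1
        sizes[n + 1] = size     # otherwise it joins the other block
    return x, sizes


if __name__ == "__main__":
    j, a = 3, 30                # hypothetical release time a
    x, sizes = evil_block_sizes(j, a, 200)
    lower = j * j               # equals j / (1 - x) for x = 1 - 1/j
    gaps = [x - Fraction(sizes[n], n) for n in range(lower, a + 1)]
    print(min(gaps) > Fraction(1, 4))
    print(Fraction(sizes[200], 200), x)
```

Before the release time the frequency of the block containing 1 stays more than 1/4 below its limit $x^{(3)}$, which is the deficit that breaks criterion (ii) of Theorem 22. After release the frequency returns to within $1/n$ of $x^{(3)}$.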

5 Further problems and possible future work 

Due to the coupling of $(T_n, n \ge 1)$ in a regenerative tree growth process, the convergence in
distribution in Theorems 17 and 22 should be strengthened to a convergence in probability or
even to almost sure convergence in all cases discussed here. We have proved tree convergence
in probability in the exchangeable case [15], and in the restricted exchangeable case [5] provided
that $\nu_j = \nu_m$, $j \ge m$, but the general case including the alpha-theta model remains open.

In the alpha-theta model [23] and the (restricted) exchangeable [15, 5] cases, we have established
a two-stage almost sure convergence to a self-similar tree $\mathcal{T}$ by passing via reduced
subtrees of $T_n$ and of $\mathcal{T}$ spanned by the first $k$ labelled leaves and letting first $n \to \infty$ and then
$k \to \infty$. More specifically, we have embedded $(T_n, n \ge 1)$ in $\mathcal{T}$ as discrete trees with edge lengths.
We have studied the evolution of reduced subtrees of $\mathcal{T}$ spanned by the first $k$ labelled leaves,
equipped with a projected mass measure, as $k$ increases, in the alpha-theta case. We refer to the
evolution of embedded trees as weighted trees with an (atomic) measure on the branches as bead 
crushing [23], see also [20] for related structures. There are several interesting variants of bead 
crushing processes and connections to transformations of trees. We aim to carry out a programme 
extending the alpha-theta case to general binary regenerative tree growth processes. 

The basic embedding problem is to find a random leaf in a self-similar tree that induces a 
given residual mass process. Another interesting structure is the joint distribution of two residual 
mass processes (see [23]). When embedded in the same tree, they coincide up to a branch point 
and then evolve independently. We propose the terms fragmenter for exponential subordinators 
$(e^{-\xi_s}, s \ge 0)$ whose self-similar time changes are residual mass processes, and bifurcator for pairs
of fragmenters that coincide up to an exponential time and then evolve independently. 